The threat of a major coronavirus pandemic urges the development of strategies to combat these pathogens. Human coronavirus 
NL63 (HCoV-NL63) is an α-coronavirus that can cause severe lower-respiratory-tract infections requiring hospitalization. 

We report here the 3.4-Å-resolution cryo-EM reconstruction of the HCoV-NL63 coronavirus spike glycoprotein trimer, which 
mediates entry into host cells and is the main target of neutralizing antibodies during infection. The map resolves the extensive 
glycan shield obstructing the protein surface and, in combination with mass spectrometry, provides a structural framework to 
understand the accessibility to antibodies. The structure reveals the complete architecture of the fusion machinery including the 
triggering loop and the C-terminal domains, which contribute to anchoring the trimer to the viral membrane. Our data further 
suggest that HCoV-NL63 and other coronaviruses use molecular trickery, based on epitope masking with glycans and activating 
conformational changes, to evade the immune system of infected hosts. 


Coronaviruses are enveloped viruses with large single-stranded 
positive-sense RNA genomes, classified in four genera (α, β, γ and δ). 
In humans, coronaviruses are responsible for 30% of respiratory-tract 
infections¹. In addition, coronaviruses have received substantial 
attention in the past decade, owing to the emergence of two deadly 
viruses with tremendous pandemic potential: severe acute respira- 
tory syndrome coronavirus (SARS-CoV) and Middle East respiratory 
syndrome coronavirus (MERS-CoV)². To date, there are no approved 
antiviral treatments or vaccines for any human coronavirus. 

Coronaviruses are zoonotic viruses, and surveillance studies have 
suggested that both SARS-CoV and MERS-CoV originated from bats 
and that camels are also likely hosts for MERS-CoV³,⁴. Moreover, 
sequencing data have demonstrated that bats serve as a reservoir of 
coronaviruses that have the potential to cross the species barrier and 
infect humans. This phenomenon is illustrated by the observation 
that substitution of three amino acid residues in the spike (S) glyco- 
protein receptor-binding domain of the bat-infecting HKU4-CoV 
enhances its affinity for human DPP4 (the MERS-CoV receptor) 
by two orders of magnitude. In addition, substitution of two other 
residues enables processing by human proteases and allows the 
HKU4-CoV S protein to mediate entry into human cells”. As a result, 
cross-species transmission of coronaviruses poses an imminent and 
long-term threat to human health. Recombination with coronaviruses 
frequently involved in mild respiratory infections may potentially 
lead to the emergence of highly pathogenic viruses*. Understanding 
the pathogenesis, cross-species transmission and recombination of 
coronaviruses is crucial to prevent or control their spread in humans 
and to evaluate the potential for long-term emerging diseases. 


To date, α- and β-coronavirus genera have been implicated 
in human diseases and zoonoses. The human coronavirus NL63 
(HCoV-NL63) is an α-coronavirus that is genetically distinct from 
the β-coronaviruses mouse hepatitis virus (MHV, the prototypical 
coronavirus), MERS-CoV and SARS-CoV, and was first isolated 
from a 7-month-old patient with a respiratory-tract infection®?. 
Further studies have revealed that HCoV-NL63 infections appear to 
be common in childhood, and most adult sera contain antibodies 
that neutralize the virus®!°. HCoV-NL63 is a major cause of bronchio- 
litis and pneumonia in newborns worldwide and can cause severe 
lower-respiratory-tract infections that require hospitalization, 
especially among young children, the elderly and immunocompromised 
adults¹¹. HCoV-NL63 infections have been reported in 
countries across Europe, Asia and North America, thus indicating 
its circulation among the human population worldwide. Other 
α-coronaviruses related to the human respiratory pathogen 
HCoV-229E have recently been identified in camels co-infected 
with MERS-CoV⁴, an observation further underscoring the importance 
of characterizing this coronavirus genus. Additionally, the 
emergence of the highly lethal porcine epidemic diarrhea coronavirus 
(PEDV, α-genus) has recently had devastating consequences for the 
US swine industry¹². 

Coronaviruses use S homotrimers to promote cell attachment and 
fusion of the viral and host membranes. Because it is virtually the only 
antigen present at the virus surface, S is the main target of neutralizing 
antibodies during infection and a focus of vaccine design!*. S is a class I 
viral fusion protein that is synthesized as a single-chain precursor of 
~1,300 amino acids and trimerizes after folding!*. It is composed of 
an N-terminal S1 subunit, containing the receptor-binding domain, 
and a C-terminal S2 subunit, driving membrane fusion. After virion 
uptake by target host cells, cleavage at the S2′ site (next to the putative 
fusion peptide) is required for fusion activation of all coronavirus 
S proteins, so that they can subsequently transition to the postfusion 
conformation!>-!”. 

Our previously reported cryo-EM reconstruction of the MHV S 
glycoprotein at 4.0-Å resolution reveals the prefusion architecture 
of the machinery mediating entry of β-coronaviruses into cells¹⁸. 
It also demonstrates that coronavirus S and paramyxovirus 
F proteins share a common evolutionary origin. Here, we set out to 
characterize the conservation of the 3D organization of spike pro- 
teins among coronaviruses belonging to different genera. We report 
the atomic-resolution structure of the pathogenic HCoV-NL63 
S-glycoprotein trimer, which belongs to the α-coronavirus genus. 
The substantial resolution improvement as compared with earlier 
studies allows visualization of the S glycoprotein at an unprece- 
dented level of detail, which is a prerequisite for guiding drug and 
vaccine design, and reveals both shared and unique features of the 
α-genus of human pathogens. Our results suggest that HCoV-NL63 
and other coronaviruses use molecular trickery, based on epitope 
masking with glycans and activating conformational changes, to 
evade the immune system of infected hosts, in a manner similar 
to that described for HIV-1. 


RESULTS 

Structure determination 

We used Drosophila S2 cells to produce the HCoV-NL63 S ectodomain 
N-terminally fused to a GCN4 trimerization motif downstream from 
the heptad-repeat 2 (HR2) helix. We imaged frozen-hydrated HCoV-NL63 
S ectodomain particles with an FEI Titan Krios electron microscope 
equipped with a Gatan Quantum GIF energy filter operated 
in zero-loss mode, with a slit width of 20 eV, and a Gatan K2 Summit 
electron-counting camera¹⁹ (Online Methods). 

Table 1 Data collection and refinement statistics 

Data collection 
  Number of particles             79,667 
  Pixel size (Å)                  1.36 
  Defocus range (μm)              2-4 
  Voltage (kV)                    300 
  Electron dose (e⁻/Å²)           48 
Refinement 
  Resolution (Å)                  3.4 
  Map-sharpening B factor (Å²)    -129 
Model validation 
  Favored rotamers (%)            97.87 
  Poor rotamers (%)               0.68 
  Ramachandran allowed (%)        99.32 
  Ramachandran outliers (%)       0.6 
  Clash score                     3.3 
  MolProbity score                1.54 

We determined a 3D reconstruction of the HCoV-NL63 S at 
3.4-Å resolution, using the gold-standard Fourier shell correlation 
(FSC) criterion of 0.143 (refs. 20,21) (Fig. 1 and Supplementary 
Fig. 1). The final model, which we built and refined with Coot²² 
and Rosetta²³⁻²⁵, includes residues 23 to 1224, with internal breaks 
between residues 110-121, 882-890 and 992-1001 (Supplementary 
Fig. 1 and Table 1). The HCoV-NL63 S ectodomain is a 160-Å-long 
trimer with a triangular cross-section. 


The ordered glycan shield 
A notable feature of this structure is the extraordinary number of 
N-linked oligosaccharides that cover the spike trimer. In the cryo-EM 
reconstruction, we observed density for 31 N-linked glycans 
extending tangentially relative to the protein surface (Fig. 2a,b, 
Supplementary Fig. 1 and Supplementary Table 1). At least the 
two core N-acetylglucosamine moieties are visible for the majority 
of glycosylation sites. 

Figure 1 Cryo-EM structure of the HCoV-NL63 S trimer. (a) Representative micrograph of frozen-hydrated HCoV-NL63 S particles (defocus 3.4 μm). 
Scale bar, 355 Å. (b) Five selected class averages showing the particles along different orientations. Scale bar, 60 Å. (c,d) 3D map filtered at 3.4-Å 
resolution and colored by protomer. Two orthogonal views of the S trimer from the side (c) and from the top, facing toward the viral membrane, (d) are 
shown. (e,f) Ribbon diagrams showing the HCoV-NL63 S atomic model, oriented as in c and d, respectively. 

Figure 2 Organization of the HCoV-NL63 S-protein glycan shield, revealed by cryo-EM and MS. (a,b) Ribbon diagrams showing two orthogonal views 
of the S trimer, from the side (a) and from the top (b), facing toward the viral membrane. Glycans are shown as dark-blue spheres. (c) Residue-level 
schematic of N-linked glycans. The most extensive glycan structure detected by MS at each site is represented except for glycans observed only by 
cryo-EM, for which the resolved sugar moieties are shown. FP, fusion peptide; HR1, heptad-repeat 1 region; HR2, heptad-repeat 2 region (shown with a 
dashed line because it is not resolved in the map); TM, transmembrane domain (the striated texture indicates regions that are not part of the construct); 
GlcNAc, N-acetylglucosamine; Man, mannose; Fuc, fucose. 

Using on-line reversed-phase liquid chromatography with electron 
transfer/high-energy collision-dissociation tandem MS²⁶, we 
detected 25 N-linked glycosylation sites overlapping with those 
observed in the cryo-EM map and identified three additional sites 
(Fig. 2c, Supplementary Fig. 2 and Supplementary Table 1). We 
identified these sites from both intact glycopeptides and peptides with 
the glycan trimmed down to the N-linked core N-acetylglucosamine 
moiety. The cryo-EM and MS data together provide evidence for 
glycosylation at 34 out of 39 possible NXS/T glycosylation sequons. 
The intact glycopeptides detected by MS/MS for HCoV-NL63 S 
expressed in Drosophila S2 cells corresponded to either paucimanno- 
sidic glycans containing three mannose residues (with or without 
core fucosylation) or high-mannose glycans containing four to nine 
mannose residues. Although glycan processing differences exist 
between insect and mammalian cell expression systems, the same 
glycosylation sequons are expected to be recognized and glyco- 
sylated in both cases. Previous reports have suggested that several 
coronavirus S glycans are of the high-mannose type, as a result of 
direct budding from the endoplasmic reticulum-Golgi intermediate 
compartment²⁷,²⁸, thus supporting the biological relevance of the 
potential glycan structures identified. 
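
The sequon search underlying these counts can be reproduced on any protein sequence with a few lines of code. The following is a minimal sketch, not the authors' pipeline: it assumes the conventional definition of an N-linked sequon (Asn, then any residue except Pro, then Ser or Thr; the Pro exclusion is not stated in the text), and the peptide `toy` is invented for illustration rather than taken from the HCoV-NL63 S sequence.

```python
import re
from typing import List

# N-X-S/T sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead keeps the match one residue wide, so overlapping
# sequons (e.g. "NNST") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str) -> List[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


toy = "MNGTAANPSQNNSTK"  # hypothetical peptide, for illustration only
print(find_sequons(toy))  # [2, 11, 12]; Asn7 is skipped because Pro follows it
```

Applied to the S ectodomain sequence, a search of this kind lists the candidate sites that are then checked against the cryo-EM density and the MS data.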

In the refined model, N-linked glycans cover a substantial amount 
of the accessible surface of the trimer (Fig. 2a,b). The higher glycan 
density per accessible surface area detected for the S2 subunits 
(819 Å²/glycan) compared with the S1 subunits (1,393 Å²/glycan) may 
explain why most coronavirus neutralizing antibodies isolated to date 
target the latter region. Because many of the observed glycosylation 
sites are topologically conserved among coronavirus S proteins, we 
suggest that the glycan footprint observed here may be representa- 
tive of those of other S proteins. Besides potentially contributing to 
immune evasion, as discussed below, S glycans have been proposed 
to play a role in host-cell entry”? via L-SIGN lectin, which is an 
alternative receptor for SARS-CoV*? and HCoV-229E?’. 


Structure of the S2′ trigger loop 

The HCoV-NL63 and MHV S2 fusion machineries are structurally 
similar and can be superimposed with excellent agreement (Fig. 3a and 
Supplementary Fig. 3; DALI³¹ Z score 29.7, r.m.s. deviation 2.2 Å over 
312 residues). In contrast to our previous MHV S structure¹⁸, most of 
the HCoV-NL63 S2′ trigger loop, which connects the upstream helix 
to the fusion peptide and participates in fusion activation, is resolved 
in the reconstruction (Fig. 3b). The trigger loop runs almost perpendicularly 
to the long axis of the S2 subunit and forms three helical segments 
before looping back to connect to the fusion peptide. Multiple 
arginine residues, forming two putative furin-cleavage sites, are present 
in the C-terminal region of the S2′ loop (863-RNIRSSR-870), which is 
characterized by weaker density, as would be expected from a 
protease-sensitive polypeptide segment. These observations are consistent with 
results of previous studies suggesting that fusion activation of the 
HCoV-NL63 S glycoprotein occurs after S2′ proteolytic processing at 
the plasma membrane (by trypsin-like proteases such as TMPRSS2) or 
in the endosomal pathway (by furin or cysteine proteases)!>?. 
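
For readers less familiar with the r.m.s. deviation quoted for the superposition, it is the root-mean-square distance between equivalent atoms (typically Cα) once two structures have been superposed. A minimal sketch follows, assuming the two coordinate lists are already superposed and paired residue by residue; DALI performs its own alignment and superposition, which this sketch does not attempt, and the coordinates are invented.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root-mean-square deviation between two paired, pre-superposed coordinate sets."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical C-alpha positions (in Å) for three paired residues.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 2.0, 0.0), (3.8, 2.0, 0.0), (7.6, 2.0, 0.0)]
print(rmsd(model_a, model_b))  # every pair is 2 Å apart, so the result is 2.0
```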

The lack of strict amino acid sequence conservation at the S2′ 
cleavage site among coronavirus S proteins reflects the usage of 
different proteases found in distinct cellular compartments for fusion 
activation!>-!”. Similarly to the additional cleavage site present between 
the S1 and S2 subunits of MERS-CoV’, the multiple glycans present in 
the vicinity of the S2′ loop probably further influence protease 
sensitivity (Fig. 3b). However, we emphasize that S2′ processing occurs at 
topologically equivalent positions for HCoV-NL63 S, MERS-CoV S, 
MHV S and probably most coronavirus S glycoproteins. 


Anchoring of the fusion machinery to the viral membrane 

The HCoV-NL63 S reconstruction (Fig. 3a) resolves a large part 
of the S2 C-terminal region that has not been observed in previous 
studies¹⁸,³³. We were able to build an atomic model for the connector 
domain and the stem helix, which connect to the HR2 region. The 
connector folds as a β-rich domain decorated with one short α-helix. 
At its C-terminal end, the polypeptide chain folds as an α-helix (stem helix, 
Fig. 3a,c,d) aligned along the three-fold molecular axis, which 
turns into the HR2 domain, corresponding to 71 additional residues 
not resolved in our map. In the trimer, the connector domains 
assemble as a cup flanking the viral membrane-proximal side of 
the ectodomain, and the stem helices form a bundle stabilized by 
hydrophobic interactions. 

Figure 3 Architecture of the complete coronavirus fusion machinery. (a) Ribbon diagram of the S2 trimer, colored by protomer with glycans 
rendered as dark-blue spheres. (b) Zoomed-in view of the S2′ trigger-loop region comprising the central helix and the fusion peptide (light blue). 
N-linked glycans are shown as dark-blue spheres. The polypeptide segment corresponding to the putative cleavage site is poorly resolved in 
the density, and this part of the model should be considered to be hypothetical. (c,d) Ribbon diagrams showing two orthogonal views of the 
S2 C-terminal region, which is assembled from the connector domains and stem helices. (e,f) Ribbon diagrams of the HCoV-NL63 S2 subunit (e) and 
of the RSV F protein (f). Conserved structural elements are colored identically to highlight the similar 3D organization of these two fusion machineries, 
whereas nonconserved regions are colored gray. The topology diagrams underscore the similar topology of the HCoV-NL63 S connector domain 
and the equivalent RSV F domain, although the tertiary structures of these domains are different, and several structural motifs have been added 
to the latter domain throughout evolution. The RSV F secondary-structural elements are annotated according to ref. 34. The N- and C-terminal 
extremities of the polypeptide segments are indicated. 

The coronavirus S connector domain and the equivalent para- 
myxovirus F domain share a related topology, although their tertiary 
structures are different, and several structural motifs have been added 
to the latter domain throughout evolution³⁴,³⁵ (Fig. 3e,f). Moreover, 
the trimer of stem helices assembles as a helical bundle, which initi- 
ates the HR2 domain in a manner reminiscent of the heptad repeat 
B (HRB) region of paramyxovirus prefusion F structures³⁴,³⁵. These 
observations lend further support to the evolutionary connection that 
we have previously proposed for the fusion machineries of these two 
viral families¹⁸. 

Comparison of the prefusion HCoV-NL63 S2 subunit with the 
structure of the postfusion core suggests that the C-terminal region 
of the connector domain and the stem helix must refold and/or change 
conformation to yield the canonical ‘trimer of hairpins’ conformation 
that mediates fusion of the host and viral membranes in all class I 
fusion proteins¹⁸,³⁶,³⁷. 


Duplication of the N-terminal domain in α-coronaviruses 

The HCoV-NL63 S structure shows the presence of an additional 
N-terminal domain not present in β-coronaviruses. Phylogenetic 
analyses suggest that this is a canonical feature of most α-coronavirus 
S glycoproteins (Fig. 4a-c). This domain, which we named domain 
0, adopts a galectin-like β-sandwich fold supplemented with a 
three-stranded β-sheet, similarly to domain A (Fig. 4d-f, DALI 
Z score 6.7, r.m.s. deviation 3.8 Å over 147 residues), thus suggesting 
a gene-duplication event. Domain 0 interacts with the 
viral-membrane-proximal side of domain A and with domain D. 

Figure 4 Evolution of the α-coronavirus S-glycoprotein fold appears to correlate with tissue tropism. (a) Schematic representation of several 
α-coronavirus S-glycoprotein S1 subunits, highlighting the presence of one or several domains 0 (blue), as compared with β-coronaviruses. HCoV-NL63 
(GenBank YP_003767.1), 229-rel. CoV 1 (GenBank ALK28775.1), 229-rel. CoV 2 (GenBank ALK28765.1), HCoV-229E (GenBank NP_073551.1), 
porcine epidemic diarrhea virus (PEDV; GenBank AAK38656.1), transmissible gastroenteritis virus strain Purdue P115 (TGEV; GenBank ABG89325.1), 
porcine respiratory coronavirus strain ISU-1 (PRCV; GenBank ABG89317.1), feline enteric coronavirus strain UU23 (FECV-UU23; GenBank 
ADC35472.1) and feline infectious peritonitis coronavirus strain UU21 (FIPV-UU21; GenBank ADL71466.1). The β-coronavirus MHV S1 subunit is 
shown for comparison. Domains A-D are indicated for MHV and HCoV-NL63. (b) Ribbon diagram of the HCoV-NL63 S1 subunit. (c) Ribbon diagram 
of the MHV S1 subunit. (d-g) Ribbon diagrams of HCoV-NL63 domain 0 (d), domain A (e), MHV domain A (f) and rotavirus VP8* (g), showing their 
structural similarity, which suggests common ancestry. HCoV-NL63 domains 0 and A probably arose from a duplication event. 


We determined that domain 0 is also structurally similar to the 
VP8* sialic acid-binding domain of the rotavirus VP4 spike protein³⁸ 
(Fig. 4g; PDB 1KQR, DALI Z score 8.9, r.m.s. deviation 3.0 Å over 
109 residues). In line with this finding, domain 0 of transmissible 
gastroenteritis coronavirus (TGEV) and of PEDV bind to sialic acid, 
and deletion of this domain in α-coronavirus S appears to correlate 
with a loss of enteric tropism³⁹. We detected no sialic acid binding 
activity for the HCoV-NL63 S1 subunit (Supplementary Fig. 4), thus 
possibly explaining the strict respiratory tropism of this virus. Instead, 
host-cell heparan sulfate proteoglycans have been shown to participate 
in HCoV-NL63 anchoring and infection⁴⁰, and we detected binding 
of heparan sulfate to the HCoV-NL63 S protein by using surface 
plasmon resonance (SPR) (Supplementary Fig. 5a). We hypothesize that 
these interactions may be mediated either by domain 0, which exhibits 
several positively charged patches on its surface (Supplementary 
Fig. 5b), or domain A, which has been reported to bind carbohydrates 
in the case of a bovine coronavirus⁴¹. 


A putative immune-evasion strategy 

Domain B, which is the HCoV-NL63 receptor-binding domain, 
exhibits a structure distinct from those of β-coronavirus B domains, 
although a topological relatedness has been detected among these 
β-rich domains⁴². Superimposition of the HCoV-NL63 and MHV 
S1 subunits highlights that their B domains feature opposite 
orientations related by an ~180° rotation (Fig. 5a,b). As a result, many of the 
HCoV-NL63 receptor-binding residues are buried through interaction 
with domain A of the same protomer, are masked by the glycan at 
residue Asn358 and are not available to engage the host-cell receptor 
(human angiotensin-converting enzyme 2, ACE2). Comparison of 
the HCoV-NL63 domain-B structure in our cryo-EM-derived model 
with the crystal structure of the same domain in complex with ACE2 
(ref. 43) revealed that the receptor-binding loop containing residues 
531-539 undergoes substantial conformational changes after binding 
(and is defined by weak density; Fig. 5c). These findings explain the 
markedly higher ACE2 binding affinity of HCoV-NL63 domain B, 
compared with that of the full-length S1 domain (Fig. 5d). 
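
A relative orientation such as the ~180° rotation between the two B domains can be read from the rotation matrix of a superposition: for a proper rotation R, the angle is arccos((tr R − 1)/2). The snippet below is a small sketch of that relation only; the matrix is an idealized two-fold rotation chosen for illustration, not the transformation obtained from the HCoV-NL63 and MHV structures.

```python
import math
from typing import Sequence


def rotation_angle_deg(r: Sequence[Sequence[float]]) -> float:
    """Rotation angle (degrees) of a 3x3 proper rotation matrix, from its trace."""
    trace = r[0][0] + r[1][1] + r[2][2]
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


# Idealized two-fold rotation about the z axis (hypothetical example).
two_fold_about_z = [
    [-1.0, 0.0, 0.0],
    [0.0, -1.0, 0.0],
    [0.0, 0.0, 1.0],
]
print(rotation_angle_deg(two_fold_about_z))  # 180.0
```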

Because the receptor-binding loops elicit potent neutralizing antibodies 
in the case of TGEV⁴⁴, MERS-CoV* and SARS-CoV4°?, 
we speculate that HCoV-NL63 has evolved to limit exposure of this 
vulnerable site to B-cell receptors via protein-protein interactions 
and glycan masking. This mechanism is reminiscent of the HIV-1 
immune evasion strategy, which relies on a glycan shield and confor- 
mational changes that are triggered by binding of CD4 and expose the 
chemokine-receptor-interacting motifs⁵⁰,⁵¹. 


DISCUSSION 

Viruses have evolved several immune-evasion strategies including 
rapid antigenic evolution, masking of epitopes and exposure of non- 
neutralizing immune-dominant ‘decoy’ epitopes. For example, HIV-1 
(ref. 52), Lassa virus⁵³, hepatitis C virus⁵⁴ and Epstein-Barr virus⁵⁵ 
exhibit extensive N-linked glycosylation, covering exposed protein 
surfaces, with glycan masses that may exceed that of the protein com- 
ponent. The HCoV-NL63 S trimer is covered by an extensive glycan 
shield consisting of 102 N-linked oligosaccharides obstructing the 
protein surface. This observation is reminiscent of descriptions of the 
HIV-1 envelope trimer*’, although the glycan density is 30% higher 
in the latter case. Furthermore, our data suggest that, similarly to 
HIV-1, coronavirus S glycans mask the protein surface and conse- 
quently limit access to neutralizing antibodies and thwart the humoral 
immune response. This strategy is illustrated by the presence of a gly- 
can linked to Asn358 in the HCoV-NL63 structure reported here. This 
glycan, along with the proteinaceous moiety of domain A, contrib- 
utes to masking the receptor-binding loops, which have been shown 
to elicit potent neutralizing antibodies for other coronaviruses*+” 
and appear to represent a potential ‘Achilles’ heel’ of these viruses. 
This hypothesis is further supported by the observation of three 
additional glycans directly protruding from the viral-membrane-distal 
side of domain B. As a result, conformational changes are 
required for the HCoV-NL63 S glycoprotein to be able to interact 
with ACE2 (ref. 43). These rearrangements and/or receptor binding 
are likely to participate in initiating the fusion reaction by disrupting 
the interactions formed between domain B and the HR1 C-terminal 
region. Interactions with heparan sulfate proteoglycans present at the 
host-cell surface might potentially contribute to activating HCoV-NL63 S 
and promote subsequent interactions with ACE2. A common theme 
arising from the analysis of α- and β-coronavirus S-glycoprotein 
structures is that domain-B-mediated host anchoring involves major 
structural rearrangements that expose the binding motifs!*°. 

Figure 5 Potential immune-evasion strategy used by HCoV-NL63. (a) Ribbon diagram of the HCoV-NL63 S trimer, highlighting the conformation of the 
S1 subunit. Domains 0, A, B, C and D are colored for one protomer. (b) The HCoV-NL63 receptor-binding loops are buried via interactions with domain A 
of the same protomer (including the glycan moiety at Asn358) and are not available to engage host-cell receptors. Superimposition of the HCoV-NL63 
(purple) and MHV (light gray) S1 subunits via their C domains highlights that their B domains feature opposite orientations related by an ~180° rotation, 
thus suggesting a putative trajectory for the conformational changes that must occur to engage the host-cell receptor. Only domain B is shown for MHV S. 
(c) Comparison of the HCoV-NL63 domain-B structure in our cryo-EM-derived model (purple) with the crystal structure of the same domain in complex 
with ACE2 (green and dark gray), showing that the receptor-binding loop containing residues 531-539 substantially changes its conformation after 
binding. (d) ACE2 binding ELISA showing that isolated HCoV-NL63 domain B (HCoV-NL63 S1-B-mFc) binds ACE2 with higher affinity than does the 
full-length S1 domain (HCoV-NL63 S1-mFc). SARS-CoV S1 (HCoV-NL63 S1-mFc) is a positive control. HCoV-NL63 S1 domain 0 (HCoV-NL63 S1-0-mFc) 
and PEDV S1 (PEDV S1-mFc), which do not bind ACE2, are negative controls. Mean values and s.d. of three independent experiments are shown. 

Visualization of the glycan shield obstructing access to the S surface 
and deciphering the molecular trickery used by some coronaviruses 
provide a rational basis for understanding the accessibility to neu- 
tralizing antibodies and may pave the way for guiding future design 
of immunogens or therapeutics. We have previously suggested that 
targeting the fusion machinery bears the promise of finding broadly 
neutralizing inhibitors of coronavirus infection¹⁸, and the high density 
of glycans decorating this region will need to be taken into 
consideration to increase the likelihood of success. 


METHODS 
Methods and any associated references are available in the online 
version of the paper. 


Accession codes. The cryo-EM map has been deposited in the 
Electron Microscopy Data Bank under accession code EMD-8331; 
the corresponding atomic model has been deposited into the Protein 
Data Bank under accession code PDB 5SZS. The MS data (includ- 
ing the raw data, COMET search results and annotated tandem 
MS spectra of all accepted glycopeptide identifications) have been 
deposited in the proteomics identifications (PRIDE) database under 
dataset PXD004557. 


Note: Any Supplementary Information and Source Data files are available in the online 
version of the paper. 
ONLINE METHODS 

Plasmids. A gene fragment encoding the HCoV-NL63 S ectodomain (residues 16-1291, 
UniProt Q6Q1S2) was PCR-amplified from a plasmid containing the full-length 
S gene. The PCR product was ligated to a gene fragment encoding a GCN4 
trimerization motif (LIKRMKQIEDKIEEIESKQKKIENEIARIKKIK)¹⁸,³⁵,⁵⁶, 
a thrombin-cleavage site (LVPRGSLE), an eight-residue-long Strep-Tag 
(WSHPQFEK) and a stop codon. Subsequent cloning was performed in the 
pMT-BiP-V5-His expression vector (Invitrogen) in frame with the Drosophila 
BiP secretion signal downstream of the metallothionein promoter. 


Production of recombinant HCoV-NL63 S ectodomain in Drosophila S2 cells. 
To generate a stable Drosophila S2 cell line expressing the recombinant HCoV-NL63 
S ectodomain, we used Effectene (Qiagen) and 2 μg of plasmid. Puromycin 
N-acetyltransferase was cotransfected and used as a dominant selectable marker. 
Stable HCoV-NL63 S-expressing cell lines were selected by addition of 7 μg/ml 
puromycin (Invivogen) to the culture medium 48 h after transfection. For 
large-scale production, the cells were cultured in spinner flasks and induced by 5 μM 
of CdCl2 at a density of approximately 10⁷ cells/mL. After one week at 28 °C, 
clarified cell supernatants were concentrated 40-fold with Vivaflow tangential 
filtration cassettes (Sartorius, 10-kDa cutoff) and adjusted to pH 8.0, before 
affinity purification with a StrepTactin Superflow column (IBA) followed by 
gel-filtration chromatography with a Superose 6 10/300 GL column (GE Life 
Sciences) equilibrated in 20 mM Tris-HCl, pH 7.5, and 100 mM NaCl. The puri- 
fied protein was quantified according to absorption at 280 nm and concentrated 
to approximately 3 mg/mL. 


Cryo-EM specimen preparation and data collection. 2 μl of purified HCoV-NL63 
spike at 1.0 mg/mL was applied to a 1.2/1.3 C-flat grid (Protochips), which 
had been glow-discharged for 30 s at 20 mA. Grids were then plunge-frozen in 
liquid ethane with an FEI Mark I Vitrobot with 7.5-s blot time and an offset of 
−3 mm at 100% humidity and 25 °C. Data were collected with Leginon automatic 
data-collection software* on an FEI Titan Krios operated at 300 kV and equipped 
with a Gatan Quantum GIF energy filter, operated in zero-loss mode with a slit 
width of 20 eV, and a Gatan K2 Summit direct electron detector camera. The dose 
rate was adjusted to 8 counts/pixel/s, and each movie was acquired in counting 
mode fractionated in 50 frames of 200 ms. 1,400 micrographs were collected in 
a single session with a defocus range between 2.0 and 4.0 μm. 
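
As a rough consistency check on these acquisition settings, the accumulated exposure can be estimated from the dose rate, the movie length and the pixel size. This is a back-of-the-envelope sketch, assuming the 1.36-Å pixel size reported for the data set. It yields detector counts per Å², which would be expected to sit somewhat below the electron dose listed in Table 1 (48 e⁻/Å²) if some incident electrons go uncounted; how the tabulated value was derived is not stated here.

```python
# Values taken from the acquisition description; variable names are illustrative.
dose_rate = 8.0        # counts per pixel per second
n_frames = 50          # frames per movie
frame_time_s = 0.2     # 200 ms per frame
pixel_size_A = 1.36    # Å per pixel

exposure_s = n_frames * frame_time_s                      # total movie length
counts_per_A2 = dose_rate * exposure_s / pixel_size_A ** 2

print(f"exposure: {exposure_s:.1f} s, accumulated: {counts_per_A2:.1f} counts/Å²")
# exposure: 10.0 s, accumulated: 43.3 counts/Å²
```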


Cryo-EM data processing. Whole-frame alignment was carried out with 
DOSEFGPU DRIFTCORR”. The parameters of the microscope contrast-transfer 
function were initially estimated with CTFFIND4 (ref. 58) and then with GCTF⁵⁹. 
Micrographs were manually masked with Appion⁶⁰ to exclude the visible carbon 
edge from images. Particles were automatically picked with DoGPicker⁶¹. Particle 
images were extracted and processed with Relion 1.4 (ref. 62) with a box size of 
320 pixels² and a pixel size of 1.36 Å. After reference-free 2D classification, we 
retained 180,000 out of 474,000 particles to run 3D classification with C1 
symmetry®. We used the initial model previously generated for MHV¹⁸ with Optimod®™ 
and low-pass-filtered the data to 60 Å as a starting reference for 3D classification. 
118,000 particles were selected and used to run gold-standard 3D refinement with 
Relion”®, thus yielding a map at 3.95-Å resolution. After particle-motion and 
radiation-damage correction with Relion particle polishing™, another round of 
3D classification with C3 symmetry was performed to select 79,667 particles. After 
gold-standard 3D refinement with this subset of particles, we obtained a 
reconstruction at 3.76-Å resolution. Per-particle defocus parameters were estimated 
with GCTF and used to run an identical round of 3D refinement that yielded the 
final 3.4-Å-resolution map. Post processing was performed with Relion to apply 
an automatically generated B factor of −129 Å². Reported resolutions were based 
on the gold-standard FSC = 0.143 criterion²⁰,²¹, and FSC curves were corrected 
for the effects of soft masking by high-resolution noise substitution®. The soft 
mask used for FSC calculation had a 10-pixel cosine-edge fall-off. 
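
To illustrate how a resolution is read off an FSC curve at the 0.143 threshold, the sketch below finds the first shell pair that brackets the threshold and interpolates linearly between them. It is not Relion's implementation, and the curve values are invented; the function name and its arguments are likewise illustrative.

```python
from typing import Sequence


def resolution_at_threshold(
    freqs: Sequence[float], fsc: Sequence[float], threshold: float = 0.143
) -> float:
    """Resolution (Å) at which the FSC first drops below `threshold`.

    `freqs` are spatial frequencies in 1/Å, in ascending order; the crossing
    point is linearly interpolated between the two bracketing shells.
    """
    if len(freqs) != len(fsc) or len(freqs) < 2:
        raise ValueError("need two equally long sequences with at least two shells")
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            crossing = freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
            return 1.0 / crossing
    raise ValueError("FSC curve never crosses the threshold")


# Hypothetical four-shell curve, for illustration only.
example_freqs = [0.1, 0.2, 0.3, 0.4]
example_fsc = [1.0, 0.8, 0.2, 0.0]
print(round(resolution_at_threshold(example_freqs, example_fsc), 2))  # 3.04
```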


Model building and analysis. UCSF Chimera®® and Coot?” were used to fit 
atomic models into the cryo-EM map. The MHV S) subunit was fit into the den- 
sity and rebuilt manually in Coot. The crystal structure of HCoV-NL63 domain 
B was then fit into the density, and the rest of the S; subunit was built with a 
combination of manual building in Coot and de novo building with Rosetta?>-?°. 
Glycan density coming after an NXS/T motif was initially manually built into 


the density, and glycan geometry was then refined with Rosetta, optimizing the 
fit-to-density as well as the energetics of protein-glycan contacts. The glycans 
were not as well defined as the protein region in the reconstruction, owing to 
flexibility and compositional heterogeneity. The final model was refined by 
application of strict noncrystallographic symmetry constraints with Rosetta, with 
a training map corresponding to one of the two maps generated by the gold- 
standard refinement procedure in Relion. The second map (testing map) was used 
only to calculate the FSC against the atomic model and thereby to prevent 
overfitting®’. The quality of the final model was analyzed with MolProbity® 
and Privateer”. Structure analysis was performed with the DALI server?! and 
areaimol’!. Electrostatic-potential calculations were performed with PDB2PQR”” 
and APBS7*. All figures were generated with UCSF Chimera®. Local resolution 
estimation was performed with ResMap”4. 
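
The NXS/T motifs used to place glycans can be located programmatically. The sketch below is a minimal Python example, assuming the conventional rule that the middle residue X is any amino acid except proline (the text above only writes NXS/T); the function name `find_sequons` and the toy sequence are hypothetical and are not taken from the HCoV-NL63 S sequence.

```python
import re

# N-X-S/T sequon; the lookahead lets overlapping sequons (e.g. "NNTS") all match.
# Excluding proline at X is an assumption based on the usual sequon definition.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]

# Toy example (hypothetical peptide).
toy = "MNGSANPTNNTS"
print(find_sequons(toy))
```

Each returned position is a candidate N-linked glycosylation site only; whether a glycan is actually present still has to be judged from the density or from the MS data.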


Mass spectrometry. HCoV-NL63 S was prepared for MS analysis unaltered or 
subjected to Endo H (NEB), subjected to Endo F3 (Millipore) or subjected to 
combined Endo H and Endo F3 deglycosylation treatment. 2 µl of the relevant 
endoglycosidases was incubated with 20 µg of HCoV-NL63 S for 14 h overnight 
in 50 mM sodium acetate, pH 4.4, at 37 °C in a 20-µl reaction. 6 µg of HCoV- 
NL63 S was then incubated in a freshly prepared solution containing 100 mM 
Tris, pH 8.5, 2% sodium deoxycholate, 10 mM Tris(2-carboxyethyl)phosphine 
and 40 mM iodoacetamide at 95 °C for 5 min; this was followed by an incuba- 
tion at 25 °C for 30 min in the dark. 1.6 µg of denatured, reduced and alkylated 
HCoV-NL63 S was then diluted into freshly prepared 50 mM ammonium 
bicarbonate and incubated for 14 h at 37 °C with 0.032 µg of either trypsin (Sigma 
Aldrich) or chymotrypsin (Sigma Aldrich). Formic acid was then added to a 
final concentration of 2% to precipitate the sodium deoxycholate in the sam- 
ples. Samples were then centrifuged at 14,000 r.p.m. for 20 min. The supernatant 
containing the (glyco)peptides was collected and spun again at 14,000 r.p.m. 
for 5 min immediately before sample analysis. Between 4 and 7 µl was run 
on a Thermo Scientific Orbitrap Fusion Tribrid mass spectrometer. A 35-cm 
analytical column and a 3-cm trap column filled with ReproSil-Pur C18AQ 5 µm 
(Dr. Maisch) beads were used. Nanospray LC-MS/MS was used to separate 
peptides over a 110-min gradient from 5% to 30% acetonitrile with 0.1% formic 
acid. A positive spray voltage of 2,100 was used with an ion-transfer-tube 
temperature of 350 °C. An electron-transfer/higher-energy collision dissocia- 
tion ion-fragmentation scheme”® was used with calibrated charge-dependent 
ETD parameters and a supplemental higher-energy collision dissociation 
energy of 0.15 for the samples with intact glycopeptides and 0.2 for the 
samples treated with endoglycosidases. A resolution setting of 120,000 with an 
AGC target of 2 × 10⁵ was used for MS1, and a resolution setting of 30,000 with 
an AGC target of 1 x 10° was used for MS2. The data were searched against a 
custom database including recombinant coronavirus S-glycoprotein sequences, 
a list of common contaminant proteins including trypsin, chymotrypsin 
and the endoglycosidases, as well as 998 decoy reverse yeast sequences, with 
trypsin or chymotrypsin as the protease, allowing up to two missed cleavages. 
All searches included carbamidomethylation of cysteine as a fixed modi- 
fication and oxidation of methionine as a variable modification. An initial 
comprehensive search for glycosylation revealed that (core-fucosylated) 
paucimannose and high-mannose structures were the only identified glycan 
species in the samples. On the basis of these findings, a final search was performed 
with COMET” on the same data with the following list of variable modifica- 
tions of asparagine residues: +HexNAc(2)Hex(3), +HexNAc(2)Hex(3)dHex(1), 
+HexNAc(2)Hex(3)dHex(2), +HexNAc(2)Hex(4), +HexNAc(2)Hex(5), 
+HexNAc(2)Hex(6), +HexNAc(2)Hex(7), +HexNAc(2)Hex(8) and 
+HexNAc(2)Hex(9). The samples treated with endoglycosidases were searched 
with +HexNAc, +HexNAc(1)dHex(1) and +HexNAc(1)dHex(2) as variable 
modifications of asparagine. We used a precursor mass tolerance of 20 p.p.m., 
0.02 fragment bin size, including b/c/y/z fragments, with monoisotopic masses 
for both precursor and fragment ions. The search results were filtered for 
modification of asparagine residues and the presence of an NX(S/T) sequon 
at the protein level. All appropriate peptide spectrum matches (PSMs) were 
manually inspected, and only those with reasonable peptide sequence cov- 
erage were kept. In addition, the spectra were inspected for the presence 
of glycan fragment ions. All glycosylation sites identified by MS listed in 
Supplementary Table 1 are based on multiple PSMs, often with multiple differ- 
ent glycans and additional confirmation from overlap between the trypsin- and 
chymotrypsin-treated samples. The greatest number of glycopeptide identifica- 
tions was made in the chymotrypsin-digested samples. 
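
To make the variable modifications listed above concrete, the following Python sketch computes the monoisotopic mass added to asparagine by each glycan composition and checks a precursor against the 20-p.p.m. tolerance. The residue masses are the standard monoisotopic values for HexNAc, Hex and dHex; the function names (`glycan_delta_mass`, `within_tolerance`) are hypothetical, and this is an illustration of the arithmetic rather than the way COMET implements it.

```python
# Standard monoisotopic residue masses (Da) of the monosaccharide units.
MONO = {"HexNAc": 203.079373, "Hex": 162.052824, "dHex": 146.057909}

# (HexNAc, Hex, dHex) counts taken from the modification lists in the text.
INTACT_GLYCANS = [(2, 3, 0), (2, 3, 1), (2, 3, 2), (2, 4, 0), (2, 5, 0),
                  (2, 6, 0), (2, 7, 0), (2, 8, 0), (2, 9, 0)]
ENDO_TREATED = [(1, 0, 0), (1, 0, 1), (1, 0, 2)]

def glycan_delta_mass(n_hexnac, n_hex, n_dhex):
    """Monoisotopic mass shift (Da) for a HexNAc/Hex/dHex composition."""
    return (n_hexnac * MONO["HexNAc"]
            + n_hex * MONO["Hex"]
            + n_dhex * MONO["dHex"])

def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

def within_tolerance(observed, theoretical, tol_ppm=20.0):
    """True if the observed mass lies within +/- tol_ppm of the theoretical mass."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm

for comp in INTACT_GLYCANS + ENDO_TREATED:
    label = "HexNAc(%d)Hex(%d)dHex(%d)" % comp
    print(f"{label:28s} +{glycan_delta_mass(*comp):.4f} Da")
```

For example, HexNAc(2)Hex(3) adds about 892.3172 Da and HexNAc(2)Hex(9) about 1,864.6342 Da to the unmodified peptide mass.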


Hemagglutination assay. The S1 subunit of HCoV-NL63 C-terminally tagged 
with the Fc portion of human IgG (S1-Fc) was tested alone or premixed with 1 µl 
of Protein A-coupled, 200-nm-sized nanoparticles (nano-screenMAG-Protein 
A beads; Chemicell, cat. no. 4503-1) to increase the avidity of S1-Fc proteins for 
sialic acids on the erythrocyte surface. The sialic acid-binding S1 subunit of 
PEDV (strain GDU, GenBank AFP81695.1) C-terminally fused to the human 
Fc portion was used as a positive control. 'Mock' indicates the conditions in 
which no S1 subunit was used (negative control). The initial concentration of 
S1-Fc was 5 µg, and two-fold serial dilutions of S1-Fc-nanoparticle mixtures 
were made in 50 µl phosphate-buffered saline supplemented with 0.1% bovine 
serum albumin. 50 µl erythrocyte suspension (0.5%) was mixed with 50 µl of 
S1-Fc-nanoparticle dilution in V-shaped 96-well plates and incubated for 2 h on 
ice, after which the wells were photographed. 
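
The two-fold serial dilution can be written out as a short Python sketch. The starting amount of 5 µg comes from the text; the number of wells is not stated and is an assumption here, as is the helper name `twofold_series`.

```python
def twofold_series(start_amount, n_wells):
    """Amounts along a two-fold serial dilution, starting at `start_amount`."""
    return [start_amount / 2 ** i for i in range(n_wells)]

# Assumed 8-well series for illustration; the well count is not given in the text.
for well, amount in enumerate(twofold_series(5.0, 8), start=1):
    print(f"well {well}: {amount:.4f} ug")
```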


Protein expression of S1 variants and ACE2. Different S1 variants of HCoV- 
NL63 S protein, including S1 (residues 1-718), S1 domain 0 (S1-0, residues 1-209) 
and S1 domain B (S1-B, residues 481-616), were C-terminally fused to the Fc 
region of mouse IgG (mFc), expressed in HEK-293T cells and affinity purified 
as previously described”®. Likewise, an S1-mFc expression plasmid was made for 
the SARS-CoV S1 domain (isolate CUHK-W1, residues 1-676) and the PEDV 
S1 domain (strain GDU; residues 1-728). Expression of the human angiotensin- 
converting enzyme ectodomain (ACE2; residues 1-614) fused to the Fc portion 
of human IgG (hFc) was performed as previously described”®. 


ACE2 binding ELISA. The ability of the HCoV-NL63 S1-mFc and S1-B-mFc chi- 
meric proteins to bind the ACE2-hFc receptor was evaluated with an ELISA-based 
assay. 100 µl of hACE2-hFc (20 µg/ml, diluted in PBS) was coated on a 96-well 
MaxiSorb plate overnight at 4 °C. Nonspecific binding sites were subsequently 
blocked with a 3% (w/v) solution of bovine serum albumin in PBS. Plates were 
washed with washing buffer (PBS with 0.05% Tween 20) and subsequently incu- 
bated with serially diluted S1-mFc proteins (starting with equimolar concentra- 
tions) for 1 h at room temperature, after which plates were washed three times 
with washing buffer. mFc-tagged S1 proteins were detected with HRP-conjugated 
polyclonal rabbit-anti-mouse immunoglobulins (1:2,000 dilution in PBS with 
0.1% BSA; DAKO, P0260; validation on manufacturer’s website), and a colorimet- 
ric reaction was produced after incubation with tetramethylbenzidine substrate 
(BioFX). The optical density (OD) was subsequently measured at 450 nm with an 
ELISA reader (EL-808, BioTEK). Background (signal from HRP-conjugated anti- 
mFc antibody alone) was subtracted from the OD450nm values. The mFc-tagged 
SARS-CoV S1 subunit was used as a positive control, whereas the mFc-tagged 
HCoV-NL63 S1 domain 0 (HCoV-NL63 S1-0-mFc) and PEDV S1 subunit (PEDV 
S1-mFc), neither of which binds ACE2, were used as negative controls. 
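
The background correction of the OD readings amounts to a simple subtraction, sketched below in Python. Averaging several antibody-only wells is an assumption (the text does not state how many background wells were used), and the function name and the readings are hypothetical.

```python
from statistics import mean

def subtract_background(od450, background_wells):
    """Subtract the mean antibody-only signal from each OD450 reading."""
    background = mean(background_wells)
    return [od - background for od in od450]

# Hypothetical readings for a dilution series and two antibody-only wells.
corrected = subtract_background([1.25, 0.75, 0.5], [0.25, 0.25])
print(corrected)
```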


Surface plasmon resonance (SPR). SPR was performed on a GE Healthcare 
Biacore T200 with a running buffer containing 20 mM HEPES, pH 7.5, 100 mM 
NaCl and 0.5% Tween-20, with a flow rate of 30 µl/min at 25 °C. A carboxymeth- 
ylated dextran (CM5) chip (GE Healthcare) was activated with N-hydroxysul- 
fosuccinimide (NHS) and 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide 
(EDC). We then either quenched the CM5 surface with ethanolamine (yielding 
a blank flow cell) or immobilized HCoV-NL63 S before quenching. 10 µg of 
HCoV-NL63 S was diluted into 10 mM sodium acetate, pH 5.5, and was directly 
immobilized for 700 s, thus yielding 28,000 RUs. After immobilization quenching, 
running buffer was flowed for 10 min to ensure a steady baseline before experi- 
mental binding. Heparan sulfate (Sigma Aldrich) was reconstituted in running 
buffer at 5.0 mg/mL. Two concentrations of heparan sulfate, 5.0 mg/mL and 
2.5 mg/mL, were injected for 80 s with a dissociation time of 400 s. The signal from 
the blank flow cell was subtracted from all data, to account for any nonspecific interactions of 
heparan sulfate with the CM5 chip, and the baseline was normalized to 0. 
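
The reference subtraction and baseline normalization can be sketched in Python as follows. This is a minimal illustration, assuming that both flow cells are sampled on the same time axis and that the baseline is taken as the mean signal in a pre-injection window; the function name, the window and the toy sensorgram are hypothetical and do not reproduce the Biacore evaluation software.

```python
import numpy as np

def reference_subtract(active, blank, time, baseline_window):
    """Subtract the blank flow cell and shift the pre-injection baseline to 0.

    active, blank   : response (RU) from the S-coated and blank flow cells
    time            : time points (s), shared by both flow cells
    baseline_window : (start, end) in seconds, before the injection
    """
    corrected = np.asarray(active, dtype=float) - np.asarray(blank, dtype=float)
    start, end = baseline_window
    mask = (time >= start) & (time < end)
    return corrected - corrected[mask].mean()

# Toy sensorgram: constant offset, bulk signal on both cells, binding on one.
time = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
blank = np.array([1.0, 1.0, 1.0, 3.0, 3.0, 1.0])
binding = np.array([0.0, 0.0, 0.0, 10.0, 10.0, 2.0])
active = blank + binding + 100.0
print(reference_subtract(active, blank, time, (0.0, 3.0)))
```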